Assessing Ranking Fidelity of Competition Structures via Weighted Mutual Information

Zach Culp, Josie Peterburs, Dr. Gregory J. Matthews, Dr. Ryan P.A. McShane

Good

Bad

Distribution of \(\hat{r}\)

Comparing Structures

  • Which competition structure is best?

    • What is considered “best”?

    • A good tournament: top k teams produced by the tournament matches closely to the true top k

  • Effectivity¹: the tournament should rank the best competitor first

  • Efficacy²: the final rankings should match the true rankings of all competitors

Effectivity and Efficacy

  • Salient ranks: the output ranks of the tournament that you want to get correct
    • Effectivity: rank 1 is the only salient rank (\(k=1\))
    • Efficacy: ranks 1 through \(n\) are all salient (\(k=n\))

Goal

  • Quantify how well a structure can accurately order the competitors using weighted mutual information
  • We propose a metric that quantifies how effectively different tournament structures recover the true ranking of the teams

Mutual Information

  • Mutual information is defined as: \[ I(X;Y) = \sum_{y \in Y}\sum_{x \in X} p(x,y) \log{\frac{p(x,y)}{p(x)p(y)}} \]

  • In our case, it can be simplified to:

    \[ I(r, \hat{r}) = n! \sum_{\hat{r}} p(r, \hat{r}) \log{\frac{p(r,\hat{r})}{(\frac{1}{n!})^2}} \]
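As a minimal sketch of the general definition above, the following computes \(I(X;Y)\) from a small joint probability table. The function name `mutual_information`, the nested-list layout, and the use of natural logarithms are assumptions made for illustration; in the setting of this talk, the rows and columns would range over the \(n!\) possible rankings rather than two outcomes.

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of a joint pmf given as joint[x][y]."""
    px = [sum(row) for row in joint]            # marginal p(x)
    py = [sum(col) for col in zip(*joint)]      # marginal p(y)
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:                         # 0 * log(0) is treated as 0
                total += pxy * math.log(pxy / (px[i] * py[j]))
    return total

# Two toy joint distributions (illustrative only)
independent = [[0.25, 0.25], [0.25, 0.25]]     # X tells us nothing about Y
perfect = [[0.5, 0.0], [0.0, 0.5]]             # X determines Y exactly

print(mutual_information(independent))
print(mutual_information(perfect))
```

The independent table should give zero information, while the perfectly dependent table should give \(\log 2\), the entropy of either variable.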

Weighted Mutual Information

  • A limitation of mutual information is that it ignores direction: it measures how much \(\hat{r}\) tells us about \(r\), not whether \(\hat{r}\) agrees with \(r\)
  • Given equal probabilities, an output ranking of \((1,2,3,4)\) carries the same information as \((4,3,2,1)\)
  • Because of this, we propose a new weighting function
  • The final metric normalizes the weighted mutual information: \(\frac{I_w(r,\hat{r})}{H(r,\hat{r})}\)
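The points above can be illustrated with a small sketch. It assumes the usual weighted form \(I_w = \sum w(x,y)\,p(x,y)\log\frac{p(x,y)}{p(x)p(y)}\) and reads \(H(r,\hat{r})\) as the joint entropy; both are assumptions here. The weight `agreement_weight` (the share of positions where the two rankings agree) is a hypothetical stand-in, not the weighting function proposed in this talk.

```python
import math
from itertools import permutations

def weighted_mi(joint, weight):
    """Weighted mutual information for joint = {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(weight(x, y) * p * math.log(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def joint_entropy(joint):
    """Joint entropy H(X, Y) in nats (assumed meaning of H(r, r_hat))."""
    return -sum(p * math.log(p) for p in joint.values() if p > 0)

def agreement_weight(r, r_hat):
    # Hypothetical weight: fraction of positions where the rankings agree
    return sum(a == b for a, b in zip(r, r_hat)) / len(r)

def unweighted(r, r_hat):
    return 1.0

perms = list(permutations(range(1, 4)))
# A structure that always returns the true ranking ...
identity = {(r, r): 1 / len(perms) for r in perms}
# ... and one that always returns it reversed
reversed_ = {(r, r[::-1]): 1 / len(perms) for r in perms}

for name, joint in [("identity", identity), ("reversed", reversed_)]:
    plain = weighted_mi(joint, unweighted)
    scaled = weighted_mi(joint, agreement_weight) / joint_entropy(joint)
    print(name, round(plain, 4), round(scaled, 4))
```

Unweighted, the two structures are indistinguishable; with a direction-aware weight, the structure that reverses the ranking scores lower than the one that reproduces it.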

Top K-teams Weighted Kendall

\[\tau_w = \frac{\sum_{i\neq j} W_{ij} \times \text{sgn}(i-j)\times \text{sgn}(R_i - R_j)}{\sum_{i,j} W_{ij} - \sum_i W_{ii}}\]
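A direct transcription of \(\tau_w\) into code may help make the indices concrete. In this sketch, `R[i]` is the tournament rank of the competitor whose true rank is `i + 1` (0-based indexing), and `top_k_weights` is a hypothetical choice of \(W\) that gives weight 1 to every pair involving at least one true top-\(k\) competitor; the weights actually used in this work are not specified here.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def weighted_kendall(R, W):
    """Weighted Kendall tau for output ranks R and weight matrix W."""
    n = len(R)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    num = sum(W[i][j] * sign(i - j) * sign(R[i] - R[j]) for i, j in pairs)
    # Sum over all i, j minus the diagonal equals the sum over i != j
    den = sum(W[i][j] for i, j in pairs)
    return num / den

def top_k_weights(n, k):
    # Hypothetical weights: 1 if either competitor is in the true top k
    return [[1.0 if min(i, j) < k else 0.0 for j in range(n)]
            for i in range(n)]

# Output ranks of an 8-team tournament, listed in order of true rank
r_hat = [3, 4, 2, 1, 7, 6, 5, 8]
print(weighted_kendall(r_hat, top_k_weights(8, 8)))  # all pairs weighted
print(weighted_kendall(r_hat, top_k_weights(8, 1)))  # only pairs with the true best
```

With \(k = n\) every pair has weight 1 and the statistic reduces to the ordinary Kendall \(\tau\); smaller \(k\) concentrates the weight on pairs that involve the strongest competitors.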

Tournament Simulations

  • Given the assigned “known strength”, \(\theta\), we simulate a game using: \(\Pr(i > j) = \frac{e^{\theta_i}}{e^{\theta_i} + e^{\theta_j}} = \frac{1}{1+e^{-(\theta_i - \theta_j)}}\)
    r   theta    r_hat   Wins   Losses
    1    1.221   3       2      1
    2    0.765   4       1      2
    3    0.431   2       2      1
    4    0.140   1       3      0
    5   -0.140   7       0      1
    6   -0.431   6       0      1
    7   -0.765   5       0      1
    8   -1.221   8       0      1
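The game model above can be sketched in a few lines. The example below uses the \(\theta\) values from the table and plays a single round robin, one of the structures compared in this talk. The function names, the fixed seed, and the random tie-breaking rule are assumptions for illustration and may differ from the simulation actually used.

```python
import math
import random

# Known strengths, listed in order of true rank (values from the table)
THETA = [1.221, 0.765, 0.431, 0.140, -0.140, -0.431, -0.765, -1.221]

def win_prob(theta_i, theta_j):
    """Probability that competitor i beats competitor j."""
    return 1.0 / (1.0 + math.exp(-(theta_i - theta_j)))

def play(theta_i, theta_j, rng):
    """Simulate one game; True means competitor i wins."""
    return rng.random() < win_prob(theta_i, theta_j)

def round_robin(theta, rng):
    """Every pair plays once; rank by wins, ties broken at random (assumed)."""
    n = len(theta)
    wins = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if play(theta[i], theta[j], rng):
                wins[i] += 1
            else:
                wins[j] += 1
    order = sorted(range(n), key=lambda t: (-wins[t], rng.random()))
    r_hat = [0] * n
    for place, t in enumerate(order, start=1):
        r_hat[t] = place
    return wins, r_hat

rng = random.Random(0)          # fixed seed so the run is reproducible
wins, r_hat = round_robin(THETA, rng)
print(wins)
print(r_hat)
```

Repeating this many times yields an empirical distribution of \(\hat{r}\) for a given structure, which is the input the ranking metrics need.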

Results

Discussion

  • We used weighted mutual information to analyze competition structures across the range from effectivity (\(k=1\)) to efficacy (\(k=n\)).
  • Using this method, we found that Round Robin is the best structure, followed by Reseeded Single Elimination, Double Elimination, and Group Stage.
  • Limitations: The “known” ranks are assigned using \(\theta\), and for some structures the order of the ranks affects how the structure is formatted.
  • Future Work: Examine how the distribution of the \(\theta\)s influences the score of each competition structure.

References

  • Appleton, David R. 1995. “May the Best Man Win?” Statistician 44 (4): 529.
  • Csató, L. 2021. Tournament Design: How Operations Research Can Improve Sports Rules. 1st ed. Cham, Switzerland: Palgrave Macmillan.
  • Devriesere, K., L. Csató, and D. Goossens. 2025. “Tournament Design: A Review from an Operational Research Perspective.” European Journal of Operational Research 324 (1): 1–21.
  • Glenn, W A. 1960. “A Comparison of the Effectiveness of Tournaments.” Biometrika 47 (3-4): 253–62.

  • Guiasu, Silviu. 1977. Information Theory with Applications. New York, NY: McGraw-Hill.
  • Johnson, Sidney, and Rodney Fort. 2022. “Match Outcome Uncertainty and Sports Fan Demand: An Agnostic Review and the Standard Economic Theory of Sports Leagues.” Int. J. Emp. Econ. 01 (02).
  • Lasek, Jan, and Marek Gagolewski. 2018. “The Efficacy of League Formats in Ranking Teams.” Stat. Modelling 18 (5-6): 411–35.
  • Sziklai, Balázs R, Péter Biró, and László Csató. 2022. “The Efficacy of Tournament Designs.” Comput. Oper. Res. 144 (105821): 105821.